Pointwise probability reinforcements for robust statistical inference

Authors

  • Benoît Frénay
  • Michel Verleysen
Abstract

Statistical inference using machine learning techniques may be difficult with small datasets because of abnormally frequent data (AFDs). AFDs are observations that are much more frequent in the training sample than they should be with respect to their theoretical probability; they include, for example, outliers. Parameter estimates tend to be biased towards models which support such data. This paper proposes to introduce pointwise probability reinforcements (PPRs): the probability of each observation is reinforced by a PPR, and a regularisation controls the amount of reinforcement which compensates for AFDs. The proposed solution is very generic, since it can be used to robustify any statistical inference method which can be formulated as a likelihood maximisation. Experiments show that PPRs can be easily used to tackle regression, classification and projection: models are freed from the influence of outliers. Moreover, outliers can be filtered manually since an abnormality degree is obtained for each observation.
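
The idea of reinforcing each observation's probability can be illustrated on a toy problem. The sketch below is not taken from the paper: the abstract does not state the objective, so it assumes a regularised log-likelihood of the form sum_i log(p(x_i | theta) + r_i) - alpha * sum_i r_i with reinforcements r_i >= 0 (an L1-style penalty). Under that assumption, the best reinforcement for fixed parameters is r_i = max(1/alpha - p_i, 0), and the parameters can be updated by weighted maximum likelihood with weights w_i = p_i / (p_i + r_i). The function name `fit_gaussian_ppr`, the parameter `alpha`, the toy data, and the use of 1 - w_i as an abnormality degree are all hypothetical choices made for illustration.

```python
import math


def gaussian_pdf(x, mu, sigma):
    """Density of a univariate Gaussian N(mu, sigma^2) at x."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))


def fit_gaussian_ppr(xs, alpha, n_iter=50):
    """Fit a Gaussian while reinforcing low-probability observations.

    Assumed objective (not confirmed by the abstract):
        sum_i log(p_i + r_i) - alpha * sum_i r_i,  with r_i >= 0.
    Returns (mu, sigma, weights), where weights[i] = p_i / (p_i + r_i).
    """
    n = len(xs)
    # Start from the ordinary maximum-likelihood estimates.
    mu = sum(xs) / n
    sigma = math.sqrt(sum((x - mu) ** 2 for x in xs) / n)
    weights = [1.0] * n
    for _ in range(n_iter):
        probs = [gaussian_pdf(x, mu, sigma) for x in xs]
        # Closed-form reinforcement for the assumed L1-style penalty:
        # only observations with density below 1/alpha are reinforced.
        reinforcements = [max(1.0 / alpha - p, 0.0) for p in probs]
        # Share of the reinforced probability explained by the model.
        weights = [p / (p + r) for p, r in zip(probs, reinforcements)]
        # Weighted maximum-likelihood update of the Gaussian parameters.
        total = sum(weights)
        mu = sum(w * x for w, x in zip(weights, xs)) / total
        sigma = math.sqrt(
            sum(w * (x - mu) ** 2 for w, x in zip(weights, xs)) / total
        )
    return mu, sigma, weights


# Toy data (made up for this sketch): ten inliers and one gross outlier.
data = [-1.2, -0.7, -0.3, -0.1, 0.0, 0.2, 0.4, 0.6, 0.9, 1.3, 15.0]
naive_mean = sum(data) / len(data)
mu, sigma, weights = fit_gaussian_ppr(data, alpha=20.0)
# Illustrative abnormality degree: close to 1 for strongly reinforced points.
abnormality = [1.0 - w for w in weights]
print(naive_mean, mu, sigma)
print(abnormality)
```

In this sketch, `alpha` sets the density threshold 1/alpha below which an observation starts to be reinforced: a larger `alpha` penalises reinforcement more heavily and moves the fit back towards ordinary maximum likelihood, while a smaller `alpha` discounts more observations.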


Similar resources

Generalized Species Sampling Priors with Latent Beta reinforcements

Many popular Bayesian nonparametric priors can be characterized in terms of exchangeable species sampling sequences. However, in some applications, exchangeability may not be appropriate. We introduce a novel and probabilistically coherent family of non-exchangeable species sampling sequences characterized by a tractable predictive probability function with weights driven by a sequence of indep...


Uniform consistency in causal inference

There is a long tradition of representing causal relationships by directed acyclic graphs (Wright, 1934). Spirtes (1994), Spirtes et al. (1993) and Pearl & Verma (1991) describe procedures for inferring the presence or absence of causal arrows in the graph even if there might be unobserved confounding variables, and/or an unknown time order, and that under weak conditions, for certain combinati...


Accurate Inference for the Mean of the Poisson-Exponential Distribution

Although the random sum distribution has been well-studied in probability theory, inference for the mean of such distribution is very limited in the literature. In this paper, two approaches are proposed to obtain inference for the mean of the Poisson-Exponential distribution. Both proposed approaches require the log-likelihood function of the Poisson-Exponential distribution, but the exact for...


Foundations of Multivariate Inference Using Modern Computers

Fisher suggested in the 1930s analytically structured pivot functions (PFs) whose distribution does not depend on unknown parameters. These pivots provided a foundation for (asymptotic) statistical inference. Anderson (1958, p. 116) introduced the concept of a critical function of observables, which finds the rejection probability of a test for Fisher's pivot. Vinod (1998) shows that Godambe's (19...


Robust EM Continual Reassessment Method in Oncology Dose Finding.

The continual reassessment method (CRM) is a commonly used dose-finding design for phase I clinical trials. Practical applications of this method have been restricted by two limitations: (1) the requirement that the toxicity outcome needs to be observed shortly after the initiation of the treatment; and (2) the potential sensitivity to the prespecified toxicity probability at each dose. To over...



Journal title:
  • Neural networks : the official journal of the International Neural Network Society

Volume 50

Publication date: 2014